A Closed-form Token-level Decomposition

Neural Information Processing Systems

The typos do not affect the related conclusions. For unsupervised LCG experiments, we use Yelp Reviews (Cho et al., 2018) and the News section of WMT; please refer to the official website of the WMT dataset (Bojar et al., 2017) for more information. For MT experiments, we load MarianMT from the es-en checkpoint provided by Hugging Face. All hyperparameters are tuned on the development set. We simply report the results after the maximum number of training epochs (usually 20). For more implementation details and tricks, please refer to our code.
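As a concrete illustration of the MT setup above, the following minimal sketch loads a MarianMT Spanish-to-English checkpoint via the Hugging Face transformers library. The checkpoint name Helsinki-NLP/opus-mt-es-en is our assumption for "the es-en checkpoint"; the excerpt does not name it.

    # Minimal sketch: loading a MarianMT es-en checkpoint from Hugging Face.
    # Assumption: the checkpoint is "Helsinki-NLP/opus-mt-es-en"; the excerpt
    # only says "the es-en checkpoint provided by Hugging Face".
    from transformers import MarianMTModel, MarianTokenizer

    checkpoint = "Helsinki-NLP/opus-mt-es-en"
    tokenizer = MarianTokenizer.from_pretrained(checkpoint)
    model = MarianMTModel.from_pretrained(checkpoint)

    # Translate a short Spanish sentence to English.
    inputs = tokenizer(["Hola, ¿cómo estás?"], return_tensors="pt", padding=True)
    outputs = model.generate(**inputs)
    print(tokenizer.batch_decode(outputs, skip_special_tokens=True))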


Controllable Text Generation with Neurally-Decomposed Oracle

Tao Meng, Sidi Lu, Nanyun Peng, Kai-Wei Chang

arXiv.org Artificial Intelligence

We propose a general and efficient framework to control auto-regressive generation models with a NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and a sequence-level Boolean oracle function, we propose to decompose the oracle function into token-level guidance that steers the base model in text generation. Specifically, the token-level guidance is approximated by a neural model trained with examples sampled from the base model, requiring no additional labeled data. Based on posterior regularization, we present the closed-form optimal solution for incorporating the token-level guidance into the base model for controllable generation. We further provide a theoretical analysis of how the approximation quality of NADO affects the controllable generation results. Experiments on two tasks, (1) text generation with lexical constraints and (2) machine translation with formality control, demonstrate that our framework efficiently guides the base model towards the given control factors while maintaining high generation quality.
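To make the token-level decomposition concrete, here is a minimal sketch of how the closed-form combination rule described in the abstract might look at a single decoding step: the base model's next-token distribution is reweighted by the ratio of the approximated oracle-satisfaction probabilities for each extended prefix versus the current prefix. The function name, tensor shapes, and epsilon smoothing are our assumptions, not details taken from the paper.

    import torch

    def nado_guided_step(base_logits, r_next, r_prefix, eps=1e-8):
        # Sketch of one oracle-guided decoding step, assuming:
        #   base_logits: (vocab_size,) logits of the base LM p(x_t | x_<t)
        #   r_next:      (vocab_size,) approx. prob. the oracle is satisfied
        #                if each candidate token is appended to the prefix
        #   r_prefix:    scalar, approx. prob. the oracle is satisfied
        #                given the current prefix alone
        # Returns q(x_t | x_<t) proportional to p(x_t | x_<t) * r_next / r_prefix.
        p = torch.softmax(base_logits, dim=-1)
        q = p * r_next / (r_prefix + eps)
        return q / q.sum()

At generation time one would sample (or take the argmax) from q instead of the base distribution p; the quantities r_next and r_prefix would come from the neural approximation of the oracle trained on samples drawn from the base model, as the abstract describes.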